Proactive Storage Health Management to Reduce Data Center Downtime
Authors
Abstract
This project applies machine learning to proactive storage health management in order to reduce data center downtime. Specifically, we develop algorithms that predict disk drive failures early enough for timely replacement. The dataset is a collection of SMART disk drive attributes provided by BackBlaze. To construct compact time series for model development, the SMART attribute data are processed with three techniques: time-series averaging, feature selection, and class balancing. Models based on four algorithms, logistic regression (LR), support vector machine (SVM), extreme gradient boosting (XGBoost), and recurrent neural network (RNN), are developed and evaluated using prediction accuracy and F-score, along with precision and recall. The baseline LR model achieved 87.2% prediction accuracy and an F-score of 0.87, with a relatively low recall of 0.82 for failed disks. SVM, XGBoost, and RNN all improve on both prediction accuracy and F-score. In particular, XGBoost and RNN achieved prediction accuracies of 94.9% and 94.1%, respectively, with F-scores above 0.94. More importantly, recall for failed disks rises above 0.93, which greatly reduces the risk of false negative predictions. Finally, proactive prediction capability is explored so that replacement can be suggested in advance. The RNN keeps prediction accuracy above 70% with a proactive (lead) time of 9 days.
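The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary labeling in which 1 marks a failed disk and 0 a healthy one, and it takes "F-score" to mean the F1 score (the harmonic mean of precision and recall); neither convention is stated in the abstract. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the project.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for the positive class (1 = failed disk, assumed)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # failed, predicted failed
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # failed, missed (false negative)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged as failed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, predicted healthy

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only; these are not results from the dataset.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Recall for the failed class is the quantity the abstract emphasizes: each false negative is a disk that fails without a prior replacement suggestion, so a model with high accuracy but low recall can still leave failures unpredicted.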
Similar resources
Using Byzantine Quorum Systems to Manage Confidential Data
This paper addresses the problem of using proactive cryptosystems for generic data storage and retrieval. Proactive cryptosystems provide high security and confidentiality guarantees for stored data, and are capable of withstanding attacks that may compromise all the servers in the system over time. However, proactive cryptosystems are unsuitable for generic data storage uses for two reasons. F...
Simulation of fire stations resources considering the downtime of machines: A case study
Considering the increasing growth of cities, population and urban fabric density, it seems necessary that emergency facilities and services such as fire stations are positioned optimally so that they can fulfill the demands well. The aim of this study is the optimization of equipment use in the fire stations, minimization of the time to arrive at the incident through management of referral call to...
Power-aware Proactive Storage-tiering Management for High-speed Tiered-storage Systems
Large-scale high-speed mass-storage systems account for a large part of the energy consumed at data centers. To conserve energy consumed by these storage systems, we propose a high-speed tiered-storage system with a power-aware proactive method of storage-tiering management that minimizes loss of performance, which we have called the energy-efficient High-speed Tiered-Storage system (eHiTS). eHi...
Proactive and Adaptive Data Migration in Hierarchical Storage Systems using Reinforcement Learning Agent
With data generation rates growing exponentially, businesses are having a difficult time maintaining data center infrastructure. Hierarchical storage systems have evolved as a better alternative for managing data, as frequently accessed data is placed on higher tiers and the least frequently accessed data on lower tiers. But the data arrangement is not always static. Data Migration is an operat...
Hospitals’ Readiness to Implement Clinical Governance
Background Quality of health services is one of the most important factors for delivery of these services. Regarding the importance and vital role of quality in the health sector, a concept known as “Clinical Governance” (CG) has been introduced into the health area which aims to enhance quality of health services. Thus, this study aimed to assess private and public hospitals’ readiness to impl...